Nonparametric Density Estimation and Clustering with Application to Cosmology
Abstract
We present a clustering method based on nonparametric density estimation. We use kernel smoothing and orthogonal series estimators to estimate the density f, and then extract the connected components of the level set using a modified version of the Cuevas et al. (2000) algorithm. We extend an idea due to Stein (1981) and Beran and Dümbgen (1998) to construct confidence sets for the level set $\{f > \delta_c\}$ using the asymptotic distribution of the loss function. Specifically, we show the stochastic convergence of the pivot process $B_n(\lambda_p) = \sqrt{n}\,\bigl(L_p(\lambda_p) - \hat{S}_p(\lambda_p)\bigr)$, where $L_p(\lambda_p)$ and $\hat{S}_p(\lambda_p)$ are the loss function and the estimated risk function with smoothing parameter $\lambda_p$. Inverting the pivot provides a confidence set for the coefficients of the orthogonal series estimator, from which one can construct confidence sets for functionals of f. We consider applications in astronomy and other fields.

Acknowledgment
This is joint work with Larry Wasserman, Chris Genovese and Bob Nichol.

References
[1] Beran, R. and Dümbgen, L. (1998). Modulation of Estimators and Confidence Sets. Ann. Statist., 26, 1826-1856.
[2] Cuevas, A., Febrero, M. and Fraiman, R. (2000). Estimating the number of clusters. The Canadian Journal of Statistics, 28, 367-382.
[3] Jang, W. and Wasserman, L. (2003). Confidence Sets for Densities and Clusters. In preparation.
[4] Stein, C. (1981). Estimation of the mean of a multivariate normal distribution. Ann. Statist., 9, 1135-1151.
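The level-set clustering idea described in the abstract can be illustrated with a small one-dimensional sketch. This is not the authors' modified Cuevas et al. algorithm: it assumes a Gaussian kernel, keeps the sample points whose estimated density exceeds a threshold, and chains retained points that lie within a linking distance of each other. The names `gaussian_kde`, `level_set_clusters`, `h`, `c` and `eps` are hypothetical and chosen only for this illustration.

```python
import math


def gaussian_kde(data, h):
    """Return x -> Gaussian kernel density estimate with bandwidth h (1-D)."""
    n = len(data)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def f_hat(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return f_hat


def level_set_clusters(data, h, c, eps):
    """Cluster the sample points whose estimated density exceeds c.

    Assumption for this sketch: two retained points belong to the same
    cluster when a chain of retained points joins them with consecutive
    gaps of at most eps. In one dimension this amounts to splitting the
    sorted retained points wherever a gap exceeds eps.
    """
    f_hat = gaussian_kde(data, h)
    kept = sorted(x for x in data if f_hat(x) > c)
    clusters = []
    for x in kept:
        if clusters and x - clusters[-1][-1] <= eps:
            clusters[-1].append(x)  # close enough: extend the current cluster
        else:
            clusters.append([x])    # gap too large: start a new cluster
    return clusters


# Two dense groups plus one isolated point (made-up illustrative data).
sample = [0.0, 0.1, 0.2, 0.3, 0.4, 2.5, 5.0, 5.1, 5.2, 5.3]
clusters = level_set_clusters(sample, h=0.3, c=0.2, eps=0.5)
print(len(clusters), clusters)
```

With these illustrative settings the isolated point at 2.5 falls below the density threshold and is discarded, leaving one cluster per dense group. In higher dimensions the sorted scan would have to be replaced by a general connected-components search over the graph that links nearby retained points.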
Similar resources
A Fast Clustering Algorithm with Application to Cosmology
We present a fast clustering algorithm for density contour clusters (Hartigan, 1975) that is a modified version of the Cuevas, Febrero and Fraiman (2000) algorithm. By Hartigan’s definition, clusters are the connected components of a level set $S_c \equiv \{f > c\}$, where f is the probability density function. We use kernel density estimators and orthogonal series estimators to estimate f and modify the...
Statistical Topology Using the Nonparametric Density Estimation and Bootstrap Algorithm
This paper presents approximate confidence intervals for each function of parameters in a Banach space based on a bootstrap algorithm. We apply a kernel density approach to estimate the persistence landscape. In addition, we evaluate the quality of the distribution function estimator of random variables using integrated mean square error (IMSE). The results of simulation studies show a significant improv...
Bayesian Density Regression and Predictor-dependent Clustering
JU-HYUN PARK: Bayesian Density Regression and Predictor-Dependent Clustering. (Under the direction of Dr. David Dunson.) Mixture models are widely used in many application areas, with finite mixtures of Gaussian distributions applied routinely in clustering and density estimation. With the increasing need for a flexible model for predictor-dependent clustering and conditional density estimation...
Fast Estimation of Nonparametric Kernel Density Through PDDP, and its Application in Texture Synthesis
In this work, a new algorithm is proposed for fast estimation of nonparametric multivariate kernel density, based on principal direction divisive partitioning (PDDP) of the data space. The goal of the proposed algorithm is to use the finite support property of kernels for fast estimation of density. Compared to earlier approaches, this work explains the need of using boundaries (for partitioning the sp...
On a Theory of Nonparametric Pairwise Similarity for Clustering: Connecting Clustering to Classification
Pairwise clustering methods partition the data space into clusters by the pairwise similarity between data points. The success of pairwise clustering largely depends on the pairwise similarity function defined over the data points, where kernel similarity is broadly used. In this paper, we present a novel pairwise clustering framework by bridging the gap between clustering and multi-class class...
Publication date: 2003